*** How worried should we be? The implications of fabricated survey data for political science
*** Table 2: Demographic composition of clean and fake data

set more off

* set directory to location of dataset in following line
cd "C:\~\Downloads\"

use "VEN_fraud_data.dta", clear
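
* Optional check, added as a sketch (not part of the original script):
* confirm that the variables this do-file relies on are present before
* continuing. The list below is taken from the commands in this file and
* assumes clean_data is the full name of the clean-case indicator.
confirm variable clean_data cem_matched likelyfraud q1 edad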

* first drop canceled cases (clean_data == 0) that were not likely frauds matched to clean cases (cem_matched != 1)
drop if clean_data == 0 & cem_matched != 1
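
* Optional sanity check, added as a sketch (not part of the original script):
* after the drop above, every remaining canceled case should be a matched one.
* assert stops the do-file with an error if any observation violates this.
assert cem_matched == 1 if clean_data == 0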

* relabel gender variable (replace overwrites any value label already named q1)
lab define q1 1 "Male" 2 "Female", replace
lab values q1 q1

** Table 2 values
* likely-fraud cases: cell percentages of age (edad) by gender (q1)
tab edad q1 if likelyfraud == 1, cell nofreq
* clean cases
tab edad q1 if clean_data == 1, cell nofreq
